Location Identification of Power Line Outages
Using PMU Measurements with Bad Data

Wen-Tai Li, Chao-Kai Wen, Jung-Chieh Chen, Kai-Kit Wong, Jen-Hao Teng, and Chau Yuen 


Abstract: The use of phasor angle measurements provided by
phasor measurement units (PMUs) is regarded as a promising
approach to identifying the locations of power line outages.
However, communication errors or system malfunctions
may corrupt the measurements and thus yield bad
data. Most existing methods for line outage identification
fail to consider such errors. This paper develops a framework
for identifying multiple power line outages based on the PMUs’ 
measurements in the presence of bad data. In particular, we 
design an algorithm to identify locations of line outage and 
recover the faulty measurements simultaneously. The proposed 
algorithm does not require any prior information on the number 
of line outages and the noise variance. Case studies carried out 
on test systems of different sizes validate the effectiveness and 
efficiency of the proposed approach. 


I. Introduction

Power line outage identification is of paramount importance for maintaining the reliable and secure operation of electric power systems. When outages occur on power transmission lines, other lines may become overloaded and consequently fail. Shortly thereafter, further cascading failures may result in system collapse. Therefore, a power system operator must identify line outages accurately and promptly. The modern wide-area measurement system (WAMS), which builds upon phasor measurement units (PMUs) and fast communication links, is considered a promising infrastructure for supporting fast line outage detection [1].

Several techniques for power line outage detection/identification based on PMU measurements have been investigated recently [2-11]. In particular, Tate and Overbye [2,3] proposed identification algorithms for single and double line outages, respectively. The idea is to find a line combination whose pre-computed phasor angle difference matches the observed one. This methodology was further extended to accommodate islanding in [4]. Zhu and Giannakis [8] then exploited sparse configurations and proposed a compressed-sensing-based algorithm for identifying multiple line outages. Following [8], Chen et al. [9] proposed an improved method adopting a cross-entropy-based global optimization technique, and Zhao and Song [10] presented a distributed framework to perform the identification locally at each phasor data concentrator. Most recently, Wu et al. [11] considered the same problem under scenarios with a limited number of PMUs.

W. T. Li, C. K. Wen, and J. H. Teng are with the Department of Electronic and Electrical Engineering, National Sun Yat-sen University, Kaohsiung 804, Taiwan. E-mail: chaokai.wen@mail.ee.nsysu.edu.tw.

J. C. Chen is with the Department of Optoelectronics and Communication Engineering, National Kaohsiung Normal University, Kaohsiung 802, Taiwan.

K. Wong is with the Department of Electronic and Electrical Engineering, University College London, London, United Kingdom.

C. Yuen is with Engineering Product Development, Singapore University of Technology and Design, Singapore.

These existing studies all rely on highly accurate phasor angle measurements (i.e., perfect PMUs). Although PMUs are more robust against measurement errors than traditional meters, communication errors or system malfunctions may introduce errors into the measurements received by phasor data concentrators. In addition, a physical impact on power system buses that results in line outages is very likely to cause faulty PMUs as well. In these scenarios, a few of the phasor angle measurements may contain errors, which are referred to as bad data. Note that bad data differ from the common small additive noise that results from uncertainties in PMUs (e.g., the A/D converters and instrument transformers). The uncertainties of PMUs are usually modeled as unstructured noise, whose effects on line outage identification have been investigated in [12], whereas bad data in the phasor angle vector lie in the range space of the susceptance matrix and can therefore arbitrarily perturb the results of line outage identification.

Although the accuracy of line outage identification is expected to degrade in the presence of bad data, a comprehensive study on this issue is missing. In this paper, we take an important first step by developing a framework for line outage identification based on phasor angle measurements with bad data. Our contribution is threefold:

• A line outage identification model is proposed that accounts for phasor angle measurements with bad data. Using this model, we can not only understand the influence of bad data on the identification problem but also design a criterion to aid line outage detection. In particular, location identification for line outages and bad data can be viewed as a sparse error detection problem, which permits us to identify them by leveraging recent techniques in compressive sensing¹ [13].

• We provide an effective algorithm for line outage identification. Unlike several prior works (even those without bad data), our scheme does not require prior information on the number of line outages or the noise variance. Specifically, all the required knowledge is learned as part of the identification procedure.

• The developed algorithm operates in a message passing fashion, which exploits the inherent sparse structure of power networks and thus leads to very low complexity. Comprehensive experimental studies show that the whole identification procedure can be completed in real time even for systems with a large number of buses (e.g., < 1 second for a 2736-bus system).

¹ Compressive sensing is a signal processing technique for efficiently reconstructing a signal from an undersampled set of linear transformations.

II. System Model and Problem Formulation 

A. DC Power Flow Model 

We consider a power transmission network with $N$ buses and $L$ transmission lines. Let $\mathcal{N} = \{1, \ldots, N\}$ be the set of buses and $\mathcal{L} = \{1, \ldots, L\}$ be the set of transmission lines. For the power transmission network, we adopt the most popular variant of the DC power flow model [14], in which the power flowing from bus $n$ to bus $m$ along line $l \in \mathcal{L}$ can be expressed as

$$P_{nm} = \frac{1}{x_{nm}} (\theta_n - \theta_m), \tag{1}$$

where $x_{nm} = x_{mn}$ represents the reactance between buses $n$ and $m$, and $\theta_n$ and $\theta_m$ are their respective voltage phasor angles.

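Let $p_n$ be the nodal injection for bus $n$. The nodal flow conservation constraint states that the amount of power injected into bus $n$ must be equal to the amount that flows out of it, which can be algebraically expressed as

$$p_n = \sum_{m \in \mathcal{N}(n)} P_{nm}, \tag{2}$$

where $\mathcal{N}(n)$ denotes the set of neighboring buses connected to bus $n$. Then (2) together with (1) yields the following linear DC power flow model in matrix form

$$\mathbf{p} = \mathbf{B}\boldsymbol{\theta}, \tag{3}$$

where $\mathbf{p} = [p_1 \cdots p_N]^T \in \mathbb{R}^N$, $\boldsymbol{\theta} = [\theta_1 \cdots \theta_N]^T \in \mathbb{R}^N$, and $\mathbf{B} = [B_{nm}] \in \mathbb{R}^{N \times N}$ with its $(n,m)$th entry given by $B_{nm} = -\frac{1}{x_{nm}}$ if $m \in \mathcal{N}(n)$ and $m \neq n$, $B_{nm} = \sum_{k \in \mathcal{N}(n)} \frac{1}{x_{nk}}$ if $n = m$, and $B_{nm} = 0$ otherwise.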

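Recall that line $l$ connects buses $n$ and $m$. If we define the $i$-th element of the incidence vector $\mathbf{m}_l$ of line $l$ as

$$m_{l,i} = \begin{cases} 1, & \text{if } i = n, \\ -1, & \text{if } i = m, \\ 0, & \text{otherwise}, \end{cases} \tag{4}$$

then $\mathbf{B}$ in (3) can be expressed as [8]

$$\mathbf{B} = \mathbf{M}\mathbf{D}_x\mathbf{M}^T = \sum_{l=1}^{L} \frac{1}{x_l}\, \mathbf{m}_l \mathbf{m}_l^T, \tag{5}$$

where $\mathbf{D}_x$ is a diagonal matrix with $1/x_l$ as its $l$-th diagonal entry, and $\mathbf{M} = [\mathbf{m}_1 \cdots \mathbf{m}_L]$ is the $N \times L$ bus-line incidence matrix.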
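The construction in (5) can be illustrated with a minimal Python sketch. The three-bus network, the 0-indexed `(n, m)` line format, and the function name `susceptance_matrix` are assumptions made only for this illustration; they do not come from the paper.

```python
def susceptance_matrix(num_buses, lines, reactances):
    """Return B = M D_x M^T of (5) as a dense list of lists.

    lines: list of (n, m) bus pairs, 0-indexed (hypothetical input format).
    reactances: list of x_l, one per line.
    """
    B = [[0.0] * num_buses for _ in range(num_buses)]
    for (n, m), x in zip(lines, reactances):
        w = 1.0 / x
        # The rank-one term (1/x_l) m_l m_l^T touches only four entries of B.
        B[n][n] += w
        B[m][m] += w
        B[n][m] -= w
        B[m][n] -= w
    return B


# Hypothetical three-bus ring used only for illustration.
lines = [(0, 1), (1, 2), (0, 2)]
reactances = [0.5, 0.25, 1.0]
B = susceptance_matrix(3, lines, reactances)
```

Each row of the resulting matrix sums to zero, which reflects the fact that a uniform shift of all phasor angles produces no power flow in the DC model.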

B. Power Line Outages 

From (3), we see that the relationship between the injected power vector $\mathbf{p}$ and the pre-event phasor angle vector $\boldsymbol{\theta}$ is dictated by the susceptance matrix $\mathbf{B}$, which is topology-dependent. Following [2,3], we assume that the post-outage grid remains connected when outages occur on the transmission lines. Once the interconnected grid has reached a stable post-event state, the post-event power flow can be expressed by

$$\mathbf{p}' = \mathbf{B}'\boldsymbol{\theta}' = \mathbf{p} + \boldsymbol{\eta}, \tag{6}$$

where $\mathbf{B}'$ and $\boldsymbol{\theta}'$ are the post-event susceptance matrix and the post-event phasor angle vector, respectively, and $\boldsymbol{\eta}$ denotes the small perturbations between $\mathbf{p}'$ and $\mathbf{p}$, usually modeled as a Gaussian noise vector with zero mean and covariance matrix $\sigma^2\mathbf{I}$ [15].

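To reflect the post-event variations, we write

$$\mathbf{B}' = \mathbf{B} - \Delta\mathbf{B} \quad \text{and} \quad \boldsymbol{\theta}' = \boldsymbol{\theta} + \Delta\boldsymbol{\theta}, \tag{7}$$

where $\Delta\mathbf{B}$ and $\Delta\boldsymbol{\theta}$ represent variations of the susceptance matrix and the phasor angle vector, respectively, between the pre- and post-event power systems. Using the notations of (5), $\Delta\mathbf{B}$ can be expressed as

$$\Delta\mathbf{B} = \sum_{l \in \mathcal{L}_o} \frac{1}{x_l}\, \mathbf{m}_l \mathbf{m}_l^T = \mathbf{M}\mathbf{D}_x \mathrm{diag}(\mathbf{s}_o)\mathbf{M}^T, \tag{8}$$

where $\mathcal{L}_o$ denotes the set of the lines in outage, and $\mathbf{s}_o = [s_{o,1} \cdots s_{o,L}]^T$ is an $L$-dimensional binary vector whose element $s_{o,l} = 1$ if the $l$-th line belongs to $\mathcal{L}_o$ and $s_{o,l} = 0$ otherwise.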

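Substituting (7) into (6) yields

$$\mathbf{y} \triangleq \mathbf{B}\Delta\boldsymbol{\theta} = \Delta\mathbf{B}\boldsymbol{\theta}' + \boldsymbol{\eta} \tag{9}$$

$$= \mathbf{M}\mathbf{D}_x \mathrm{diag}(\mathbf{M}^T\boldsymbol{\theta}')\,\mathbf{s}_o + \boldsymbol{\eta}, \tag{10}$$

where the last equality follows from the fact that $\mathrm{diag}(\mathbf{s}_o)\mathbf{M}^T\boldsymbol{\theta}' = \mathrm{diag}(\mathbf{M}^T\boldsymbol{\theta}')\,\mathbf{s}_o$. By introducing the notation

$$\mathbf{A}_{\theta'} \triangleq \mathbf{M}\mathbf{D}_x \mathrm{diag}(\mathbf{M}^T\boldsymbol{\theta}'), \tag{11}$$

we then arrive at

$$\mathbf{y} = \mathbf{A}_{\theta'}\mathbf{s}_o + \boldsymbol{\eta}. \tag{12}$$

Here, the notation $\mathbf{A}_{\theta'} \in \mathbb{R}^{N \times L}$ indicates that the matrix depends on $\boldsymbol{\theta}'$. Note that since $\mathbf{B}$ and $\Delta\boldsymbol{\theta}$ are available, $\mathbf{y}$ can be obtained from its definition in (9). In addition, since the pre-event network topology (i.e., $\mathbf{M}$ and $\mathbf{D}_x$) as well as the post-event phasor angle vector $\boldsymbol{\theta}'$ are known, $\mathbf{A}_{\theta'}$ is also available by (11). Therefore, with (12), the power line outages can be identified by solving

$$\text{P1}: \quad \hat{\mathbf{s}}_o = \operatorname*{argmin}_{\mathbf{s}_o \in \{0,1\}^L} \mathcal{J}_1(\mathbf{s}_o; \mathbf{y}, \mathbf{A}_{\theta'}), \tag{13}$$

where $\mathcal{J}_1$ is a cost function of $\mathbf{s}_o$ associated with the model in (12). For example, $\mathcal{J}_1(\mathbf{s}_o; \mathbf{y}, \mathbf{A}_{\theta'}) = \|\mathbf{y} - \mathbf{A}_{\theta'}\mathbf{s}_o\|_2^2$ is adopted in [2,3,8,9] for line outage identification applications.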
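As a rough illustration of (11)-(13), the following Python sketch builds the matrix of (11) for a small network and solves Problem P1 by exhaustive search restricted to single-line outages. The four-bus network, the angle values, and the function names are hypothetical and chosen only for this example; restricting the search to one outaged line is a simplifying assumption.

```python
def outage_matrix(num_buses, lines, reactances, theta_post):
    """A = M D_x diag(M^T theta') of (11), stored densely as an N x L list of lists."""
    A = [[0.0] * len(lines) for _ in range(num_buses)]
    for l, ((n, m), x) in enumerate(zip(lines, reactances)):
        flow = (theta_post[n] - theta_post[m]) / x  # (1/x_l) * m_l^T theta'
        A[n][l] = flow
        A[m][l] = -flow
    return A


def identify_single_outage(A, y):
    """Exhaustive search over single-line outages: argmin_l ||y - A[:, l]||^2."""
    def cost(l):
        return sum((y[n] - A[n][l]) ** 2 for n in range(len(y)))
    return min(range(len(A[0])), key=cost)


# Hypothetical four-bus, five-line network (0-indexed buses and lines).
lines = [(0, 1), (1, 2), (0, 2), (2, 3), (1, 3)]
reactances = [0.5, 0.25, 1.0, 0.4, 0.8]
theta_post = [0.0, -0.10, -0.25, -0.40]
A = outage_matrix(4, lines, reactances, theta_post)

true_line = 3
# Observation of (12) with a small deterministic perturbation standing in for the noise.
y = [A[n][true_line] + 1e-3 * (-1) ** n for n in range(4)]
found = identify_single_outage(A, y)
```

For multiple outages the candidate set grows combinatorially, which is one reason sparse estimation methods are attractive for this problem.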

C. Power Line Outages with Bad Data 

When the phasor angle measurements $(\boldsymbol{\theta}, \boldsymbol{\theta}')$ are accurate, recent reports [2,3,8,9] have verified the efficacy of Problem P1 for line outage identification. However, if some measurements of $(\boldsymbol{\theta}, \boldsymbol{\theta}')$ are erroneous or bad, Problem P1 will result in incorrect line outage identification. To better understand the line outage identification problem with bad data, we develop a counterpart of the model in (12) for the case in which some bad data are present in $\boldsymbol{\theta}'$.² We denote the corrupted measurement of $\boldsymbol{\theta}'$ by

$$\tilde{\boldsymbol{\theta}}' = \boldsymbol{\theta}' + \boldsymbol{\theta}'_b, \tag{14}$$

where $\boldsymbol{\theta}'_b$ is an unknown vector whose $n$-th entry is nonzero only if that entry is a bad datum. Similar to (7), we let

$$\tilde{\boldsymbol{\theta}}' = \boldsymbol{\theta} + \Delta\boldsymbol{\theta}_b \tag{15}$$

be the measured post-event phasor angle vector. Note that $\Delta\boldsymbol{\theta}_b$ contains not only the variations of the phasor angle vector between the pre- and post-event power systems but also the bad data.

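In this case, $\mathbf{y}$ in (9) becomes

$$\mathbf{y} = \mathbf{B}\Delta\boldsymbol{\theta}_b = \mathbf{B}(\Delta\boldsymbol{\theta} + \boldsymbol{\theta}'_b) = \Delta\mathbf{B}\boldsymbol{\theta}' + \mathbf{B}\boldsymbol{\theta}'_b + \boldsymbol{\eta}, \tag{16}$$

where the second equality follows by substituting the definitions in (7), (14), and (15), and the last equality follows from (9). Comparing (16) with (10), we see that $\mathbf{B}\boldsymbol{\theta}'_b$ in (16) is the effect due to the bad data. Recalling from (14), $\boldsymbol{\theta}'_b$ is a sparse vector. To clarify the effect of the bad data, we let $\mathcal{L}_b \subseteq \mathcal{L}$ be the set of lines connected to the faulty buses. Also, let $\mathbf{s}_b$ be an $L$-dimensional binary vector whose element $s_{b,l} = 1$ if the $l$-th line belongs to $\mathcal{L}_b$ and $s_{b,l} = 0$ otherwise. Thus, we use (7) and these notations to write

$$\mathbf{B}\boldsymbol{\theta}'_b = (\Delta\mathbf{B} + \mathbf{B}')\boldsymbol{\theta}'_b = (\Delta\mathbf{B} + \Delta\mathbf{B}_b)\boldsymbol{\theta}'_b, \tag{17}$$

where $\Delta\mathbf{B}_b = \mathbf{M}\mathbf{D}_x \mathrm{diag}(\mathbf{s}_b)\mathbf{M}^T$.³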

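Recalling the definition of $\mathbf{A}$ from (11) and using (14), we write

$$\mathbf{A}_{\tilde{\theta}'} = \mathbf{M}\mathbf{D}_x \mathrm{diag}(\mathbf{M}^T\tilde{\boldsymbol{\theta}}') = \mathbf{A}_{\theta'} + \mathbf{A}_{\theta'_b}. \tag{18}$$

Then substituting (17) into (16) shows that

$$\begin{aligned} \mathbf{y} &= \Delta\mathbf{B}(\boldsymbol{\theta}' + \boldsymbol{\theta}'_b) + \Delta\mathbf{B}_b\boldsymbol{\theta}'_b + \boldsymbol{\eta} \\ &= \mathbf{A}_{\tilde{\theta}'}\mathbf{s}_o + \mathbf{A}_{\theta'_b}\mathbf{s}_b + \boldsymbol{\eta} \\ &= \mathbf{A}_{\tilde{\theta}'}(\mathbf{s}_o + \mathbf{s}_b) - \mathbf{A}_{\theta'}\mathbf{s}_b + \boldsymbol{\eta}, \end{aligned} \tag{19}$$

where the second equality follows from an algebraic step similar to that in (10), and the last equality follows from (18). By introducing

$$\mathbf{s} \triangleq \mathbf{s}_o + \mathbf{s}_b \quad \text{and} \quad \mathbf{e} \triangleq -\mathbf{A}_{\theta'}\mathbf{s}_b, \tag{20}$$

we thus arrive at

$$\mathbf{y} = \mathbf{A}_{\tilde{\theta}'}\mathbf{s} + \mathbf{e} + \boldsymbol{\eta}. \tag{21}$$

Note that $\mathbf{s}$ is still a binary vector. In addition, since matrix $\mathbf{A}_{\theta'}$ and vector $\mathbf{s}_b$ are sparse, $\mathbf{e} \in \mathbb{R}^N$ is also a sparse vector, which results in sparse contamination of $\mathbf{y}$. See an example of $\mathbf{e}$ in Figure 1.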
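The algebra leading from (19) to (21) can be checked numerically. The sketch below is only an illustration: the five-bus network, the angle values, and the helper names are assumptions, the noise term is omitted, and the outaged line is chosen so that it is disjoint from the lines touching the faulty bus.

```python
def outage_matrix(num_buses, lines, reactances, theta):
    """A_theta = M D_x diag(M^T theta), stored densely as an N x L list of lists."""
    A = [[0.0] * len(lines) for _ in range(num_buses)]
    for l, ((n, m), x) in enumerate(zip(lines, reactances)):
        flow = (theta[n] - theta[m]) / x
        A[n][l] = flow
        A[m][l] = -flow
    return A


def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]


num_buses = 5
lines = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4), (2, 4)]  # hypothetical topology
reactances = [0.5, 0.25, 1.0, 0.4, 0.8, 0.6]
theta = [0.0, -0.10, -0.25, -0.40, -0.55]   # true post-event angles theta'
theta_b = [0.3, 0.0, 0.0, 0.0, 0.0]         # bad datum at bus 0
theta_tilde = [a + b for a, b in zip(theta, theta_b)]
s_o = [0, 0, 0, 0, 1, 0]                    # line 4 is in outage
s_b = [1, 1, 0, 0, 0, 0]                    # lines touching the faulty bus 0

A_true = outage_matrix(num_buses, lines, reactances, theta)
A_bad = outage_matrix(num_buses, lines, reactances, theta_b)
A_tilde = outage_matrix(num_buses, lines, reactances, theta_tilde)

# Second line of (19): A_tilde s_o + A_{theta_b} s_b.
lhs = [a + b for a, b in zip(matvec(A_tilde, s_o), matvec(A_bad, s_b))]
# Noise-free (21): A_tilde s + e with s = s_o + s_b and e = -A_{theta'} s_b.
s = [a + b for a, b in zip(s_o, s_b)]
e = [-v for v in matvec(A_true, s_b)]
rhs = [a + b for a, b in zip(matvec(A_tilde, s), e)]
```

In this example `e` is nonzero only at the faulty bus and its two neighbors, which is the sparse contamination pattern described in the text.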

According to the discussion in Section II-B, matrix $\mathbf{A}_{\tilde{\theta}'}$ is available while $\mathbf{s}$, $\mathbf{e}$, and $\boldsymbol{\eta}$ are unknown. Clearly from (21), when bad data are present, the line outages cannot be effectively identified using Problem P1 in (13). This is not only because the additional $\mathbf{e} \in \mathbb{R}^N$ contaminates $\mathbf{y}$ but also because both $\mathbf{s}_o$ and $\mathbf{s}_b$ are involved in $\mathbf{s}$. To address this problem, we propose to estimate both $\mathbf{s}$ and $\mathbf{e}$ from $\mathbf{y}$ by solving the following optimization problem:

$$\text{P2}: \quad (\hat{\mathbf{s}}, \hat{\mathbf{e}}) = \operatorname*{argmin}_{\mathbf{s} \in \{0,1\}^L,\ \mathbf{e} \in \mathbb{R}^N} \mathcal{J}_2(\mathbf{s}, \mathbf{e}; \mathbf{y}, \mathbf{A}_{\tilde{\theta}'}), \tag{22}$$

where $\mathcal{J}_2$ is a cost function of $(\mathbf{s}, \mathbf{e})$ associated with the model in (21). However, note that even if $\mathbf{s}$ can be successfully estimated via Problem P2, the locations of the line outages are still unknown because, as mentioned previously, $\mathbf{s}$ defined in (20) contains the location information of both the line outages and the bad data. In the subsequent sections, we first provide a way to separate $\mathbf{s}_o$ from $\mathbf{s}$ in Section III, and postpone solving Problem P2 to Section IV.

Fig. 1. A realization of $\mathbf{e}$ for a 300-bus system with 5 faulty buses.

Fig. 2. A six-bus system.

² Bad data could also be present in $\boldsymbol{\theta}$. Since we only utilize the differences between the pre- and post-event measurements, the bad data in $\boldsymbol{\theta}$ can be absorbed into the corrupted post-event measurement $\tilde{\boldsymbol{\theta}}'$.

³ We notice that $\Delta\mathbf{B}\boldsymbol{\theta}'_b = \mathbf{0}$ if $\mathcal{L}_o \cap \mathcal{L}_b = \emptyset$ and $\Delta\mathbf{B}\boldsymbol{\theta}'_b \neq \mathbf{0}$ otherwise.

III. Line Outage Identification with Bad Data

To start with, we assume that $\mathbf{s}$ has been obtained successfully from Problem P2, i.e., $\hat{\mathbf{s}} = \mathbf{s}_o + \mathbf{s}_b$. Before proceeding, we make the following definitions for ease of exposition. Recall the sets of lines $\mathcal{L}_o$ and $\mathcal{L}_b$, whose elements are the lines in outage and the lines connected to the faulty buses, respectively. Let $\mathcal{N}_o$ and $\mathcal{N}_b$ be the sets of buses associated with $\mathcal{L}_o$ and $\mathcal{L}_b$, respectively. Specifically, set $\mathcal{N}_o$ contains all the buses involved in the line outages $\mathcal{L}_o$, and set $\mathcal{N}_b$ contains all the buses connected by the lines of $\mathcal{L}_b$. Let $\mathcal{E}_b$ be the set of the faulty buses. Since each line connects two buses, $\mathcal{E}_b$ is only a subset of, but not equal to, $\mathcal{N}_b$. For a better understanding of these definitions, we provide an example in Figure 2, where line 4 is in outage and bus 1 is faulty. Therefore, $\mathcal{L}_o = \{4\}$, $\mathcal{N}_o = \{3,4\}$, $\mathcal{E}_b = \{1\}$, $\mathcal{L}_b = \{1,3\}$, and $\mathcal{N}_b = \{1,2,3\}$.

Next, we provide ways to 1) separate $\mathbf{s}_o$ from $\mathbf{s}$ and 2) recover $\boldsymbol{\theta}'$ from its corrupted measurement $\tilde{\boldsymbol{\theta}}'$, which are referred to as the separation phase (or S-phase) and the recovering phase (or R-phase), respectively.

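S-Phase: Recall the bus-line incidence matrix $\mathbf{M}$ from (4) and (5); see also Figure 2 for an example. If we assume that there is at most one line in outage on each bus, then set $\mathcal{N}_o$ is included in $\{n \in \mathcal{N} : \sum_{l} |M_{nl}|\, \hat{s}_l = 1\}$. Therefore, we can separate the faulty buses from $\hat{\mathbf{s}}$ by

$$\hat{\mathcal{E}}_b = \Big\{ n \in \mathcal{N} : \sum_{l} |M_{nl}|\, \hat{s}_l > 1 \Big\}. \tag{23}$$

Clearly, $\hat{\mathcal{E}}_b = \emptyset$ means that no bad data are present. With $\hat{\mathcal{E}}_b$, we further define

$$\hat{\mathcal{L}}_b = \Big\{ l \in \mathcal{L} : \sum_{n \in \hat{\mathcal{E}}_b} |M_{nl}| > 0 \Big\}, \tag{24}$$

which thus induces the following set

$$\hat{\mathcal{N}}_b = \Big\{ n \in \mathcal{N} : \sum_{l \in \hat{\mathcal{L}}_b} |M_{nl}| > 0 \Big\}. \tag{25}$$

We can see that $\hat{\mathcal{E}}_b = \mathcal{E}_b$, $\hat{\mathcal{L}}_b = \mathcal{L}_b$, and $\hat{\mathcal{N}}_b = \mathcal{N}_b$. These relations can be easily understood through the example in Figure 2. With $\hat{\mathcal{L}}_b$, we can determine $\hat{\mathbf{s}}_b$ (the estimate of $\mathbf{s}_b$) by setting $\hat{s}_{b,l} = 1$ if the $l$-th line belongs to $\hat{\mathcal{L}}_b$ and $\hat{s}_{b,l} = 0$ otherwise. Eventually, we obtain $\hat{\mathbf{s}}_o = \hat{\mathbf{s}} - \hat{\mathbf{s}}_b$ (the estimate of $\mathbf{s}_o$), and complete the S-phase.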

However, notice that the above argument is based on the assumption of at most one line outage on each bus. If this constraint is relaxed, $\hat{\mathcal{E}}_b$ will contain some of the buses connected by such outage lines. Consequently, $\hat{\mathcal{N}}_b$ will contain some buses in $\mathcal{N}_o$. Fortunately, this confusion can be eliminated in the subsequent R-phase.
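The S-phase rules (23)-(25) can be sketched in a few lines of Python. The six-bus topology below is hypothetical (it is not the network of Figure 2), buses and lines are 0-indexed, and the sketch assumes that the outaged line does not touch any bus adjacent to the faulty bus, in addition to the one-outage-per-bus assumption stated above.

```python
def separation_phase(num_buses, lines, s):
    """Split s = s_o + s_b following (23)-(25); buses and lines are 0-indexed."""
    # (23): a bus touched by more than one flagged line is declared faulty.
    count = [0] * num_buses
    for l, (n, m) in enumerate(lines):
        if s[l]:
            count[n] += 1
            count[m] += 1
    faulty_buses = {n for n in range(num_buses) if count[n] > 1}
    # (24): lines incident to at least one faulty bus.
    bad_lines = {l for l, (n, m) in enumerate(lines)
                 if n in faulty_buses or m in faulty_buses}
    # (25): buses connected by those lines.
    bad_buses = {b for l in bad_lines for b in lines[l]}
    s_b = [1 if l in bad_lines else 0 for l in range(len(lines))]
    s_o = [s[l] - s_b[l] for l in range(len(lines))]
    return faulty_buses, bad_lines, bad_buses, s_o


# Hypothetical six-bus network: bus 0 is faulty (lines 0 and 1), line 4 is in outage.
lines = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4), (4, 5), (3, 5)]
s = [1, 1, 0, 0, 1, 0, 0]
faulty, bad_lines, bad_buses, s_o = separation_phase(6, lines, s)
```

Here the remaining nonzero entry of `s_o` marks line 4 as the outaged line, while the two lines attached to bus 0 are attributed to bad data.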

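R-Phase: The aim of this phase is to recover $\boldsymbol{\theta}'$ from its corrupted measurement $\tilde{\boldsymbol{\theta}}'$. Letting $\mathbf{y}_b = \mathbf{A}_{\tilde{\theta}'}\hat{\mathbf{s}} - \mathbf{y}$, we then use (19) to write

$$\mathbf{y}_b = \mathbf{A}_{\theta'}\mathbf{s}_b - \boldsymbol{\eta} = \mathbf{M}\mathbf{D}_x \mathrm{diag}(\mathbf{s}_b)\mathbf{M}^T\boldsymbol{\theta}' - \boldsymbol{\eta}, \tag{26}$$

where the second equality follows from an algebraic step similar to that in (10). Notice that not all the entries of $\boldsymbol{\theta}'$ need to be estimated. Only the few corrupted measurements of $\boldsymbol{\theta}'$, whose locations have been identified by $\hat{\mathcal{E}}_b$, need to be recovered.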

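Toward this end, we first make the following definitions. For any vector $\mathbf{a} \in \mathbb{R}^M$ and index set $\alpha \subseteq \{1, \ldots, M\}$, we denote the (sub)vector that lies in the entries of $\mathbf{a}$ indexed by $\alpha$ as $[\mathbf{a}]_\alpha$. Similarly, for any matrix $\mathbf{A} \in \mathbb{R}^{M \times M}$, we denote the (sub)matrix that lies in the rows and columns of $\mathbf{A}$ indexed by $\alpha$ as $[\mathbf{A}]_\alpha$. The cardinality $|\alpha|$ denotes the number of members of $\alpha$. Then from (26), we find that recovering $[\boldsymbol{\theta}']_{\hat{\mathcal{E}}_b}$ (the corrupted measurements of $\boldsymbol{\theta}'$) is possible through solving the following optimization problem:

$$\text{P3}: \quad [\hat{\boldsymbol{\theta}}']_{\hat{\mathcal{N}}_b} = \operatorname*{argmin}_{[\boldsymbol{\theta}']_{\hat{\mathcal{N}}_b}} \Big\| [\mathbf{y}_b]_{\hat{\mathcal{N}}_b} - [\mathbf{M}\mathbf{D}_x \mathrm{diag}(\hat{\mathbf{s}}_b)\mathbf{M}^T]_{\hat{\mathcal{N}}_b} [\boldsymbol{\theta}']_{\hat{\mathcal{N}}_b} \Big\|_2^2 \quad \text{s.t.} \quad \theta'_n = \tilde{\theta}'_n, \ \forall n \notin \hat{\mathcal{E}}_b. \tag{27}$$

Here, the estimate of $\boldsymbol{\theta}'$ is denoted by $\hat{\boldsymbol{\theta}}'$. Since $\hat{\mathcal{E}}_b \subseteq \hat{\mathcal{N}}_b$, the estimate $[\hat{\boldsymbol{\theta}}']_{\hat{\mathcal{N}}_b}$ includes the estimate $[\hat{\boldsymbol{\theta}}']_{\hat{\mathcal{E}}_b}$. Problem P3 can be easily solved by eliminating the known variables $\{\theta'_n = \tilde{\theta}'_n, \forall n \notin \hat{\mathcal{E}}_b\}$ from its objective function, and then applying the linear least squares method to solve for the unknown variables.

If $\hat{\mathcal{E}}_b = \mathcal{E}_b$, the above procedure recovers $[\boldsymbol{\theta}']_{\mathcal{E}_b}$ from the corrupted measurements. However, as mentioned in the S-phase, if there is more than one line outage on a bus, such a bus cannot be identified as a member of $\mathcal{N}_o$, but is included in $\hat{\mathcal{N}}_b$. In this case, some of the lines in $\hat{\mathcal{L}}_b$ should be in the outage state. That is, the zero-one state of $[\hat{\mathbf{s}}_b]_{\hat{\mathcal{L}}_b}$ is uncertain rather than $[\hat{\mathbf{s}}_b]_{\hat{\mathcal{L}}_b} = \mathbf{1}$ as given in the S-phase. To determine its state, an exhaustive search (ES) algorithm is employed to evaluate all possible combinations $[\hat{\mathbf{s}}_b]_{\hat{\mathcal{L}}_b} \in \{0,1\}^{|\hat{\mathcal{L}}_b|}$, and then find the combination that yields the minimum error of Problem P3. As $|\hat{\mathcal{L}}_b|$ is very small (e.g., $|\mathcal{L}_b| = 2$ in Figure 2) and does not grow with the number of buses, ES for the recovering phase can be realized in real time. Consequently, we have simultaneously identified $\mathbf{s}_b$ and recovered $\boldsymbol{\theta}'$, and therefore completed the R-phase.

The step-wise implementation procedure of the proposed line outage identification algorithm is summarized as Algorithm 1. In short, we first obtain $(\hat{\mathbf{s}}, \hat{\mathbf{e}})$ by solving Problem P2 (lines 1-2 of Algorithm 1). Next, $\hat{\mathbf{s}}_b$ is separated from $\hat{\mathbf{s}}$ via the S-phase (lines 3-4) and then refined by the R-phase (lines 5-6). The locations of line outages are finally indicated by $\hat{\mathbf{s}}_o$ at line 7.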
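The least-squares step of the R-phase can be sketched for the simplest case of a single faulty bus, where eliminating the known angles leaves a scalar unknown. The network, the angle values, and the function names below are hypothetical, and the noise in the model for the bad-data residual is omitted. Rows of the matrix outside the affected buses are all zero, so using every row gives the same minimizer as restricting to the affected buses.

```python
def bad_line_laplacian(num_buses, lines, reactances, s_b):
    """Return M D_x diag(s_b) M^T as a dense list of lists."""
    dB = [[0.0] * num_buses for _ in range(num_buses)]
    for (n, m), x, flag in zip(lines, reactances, s_b):
        if flag:
            w = 1.0 / x
            dB[n][n] += w
            dB[m][m] += w
            dB[n][m] -= w
            dB[m][n] -= w
    return dB


def recover_single_angle(dB, y_b, theta_tilde, f):
    """Least-squares estimate of the angle at bus f, all other angles fixed to their measurements."""
    num_buses = len(dB)
    # Column of the matrix that multiplies the unknown angle.
    c = [dB[n][f] for n in range(num_buses)]
    # Residual after removing the contribution of the trusted angles.
    r = [y_b[n] - sum(dB[n][m] * theta_tilde[m] for m in range(num_buses) if m != f)
         for n in range(num_buses)]
    return sum(ci * ri for ci, ri in zip(c, r)) / sum(ci * ci for ci in c)


lines = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4), (4, 5), (3, 5)]  # hypothetical topology
reactances = [0.5, 0.25, 1.0, 0.4, 0.8, 0.6, 0.5]
s_b = [1, 1, 0, 0, 0, 0, 0]                       # lines touching the faulty bus 0
theta_true = [0.05, -0.10, -0.25, -0.40, -0.55, -0.60]
theta_tilde = [0.35] + theta_true[1:]             # bus 0 reports a bad datum
dB = bad_line_laplacian(6, lines, reactances, s_b)
# Noise-free residual: y_b = M D_x diag(s_b) M^T theta'.
y_b = [sum(dB[n][m] * theta_true[m] for m in range(6)) for n in range(6)]
theta_0_hat = recover_single_angle(dB, y_b, theta_tilde, f=0)
```

With several faulty buses the same elimination leads to a small linear least-squares system instead of a scalar division.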


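Algorithm 1: Line Outage Identification with Bad Data

input : The pre-event phasor angle vector $\boldsymbol{\theta}$, the measured post-event phasor angle vector $\tilde{\boldsymbol{\theta}}'$, and the pre-event susceptance matrix $\mathbf{B}$
output: The indicator vector for line outages $\hat{\mathbf{s}}_o$

begin
1   Generate $\mathbf{y} = \mathbf{B}(\tilde{\boldsymbol{\theta}}' - \boldsymbol{\theta})$;
2   Estimate $(\hat{\mathbf{s}}, \hat{\mathbf{e}})$ by using Problem P2 in (22);
    Separation Phase:
3   Find sets $\hat{\mathcal{E}}_b$, $\hat{\mathcal{L}}_b$, $\hat{\mathcal{N}}_b$ by using (23), (24), and (25), respectively;
4   Get $\hat{\mathbf{s}}_b$ from $\hat{\mathcal{L}}_b$;
    Recovering Phase:
5   Generate $\mathbf{y}_b = \mathbf{A}_{\tilde{\theta}'}\hat{\mathbf{s}} - \mathbf{y}$;
6   Estimate $[\hat{\boldsymbol{\theta}}']_{\hat{\mathcal{N}}_b}$ and refine $[\hat{\mathbf{s}}_b]_{\hat{\mathcal{L}}_b}$ simultaneously by using Problem P3 in (27);
7   Return $\hat{\mathbf{s}}_o = \hat{\mathbf{s}} - \hat{\mathbf{s}}_b$.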


IV. Estimation Algorithm 

Now, we consider the estimation of $(\mathbf{s}, \mathbf{e})$ based on Problem P2 in (22). This task seems nearly impossible because the total number of unknown variables, $L + N$, is much larger than the number of observations, $N$. Nevertheless, $\mathbf{s}$ and $\mathbf{e}$ are sparse vectors (see the discussion in Section II-C). By exploiting the sparsity of $(\mathbf{s}, \mathbf{e})$, we can estimate them accurately by leveraging recent techniques in compressive sensing (abbreviated as CS in the sequel); see [13] for a recent exhaustive list of the algorithms.

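In the CS literature, one popular suboptimal and low-complexity estimator is $\ell_1$-regularized least squares (LS), a.k.a. the least absolute shrinkage and selection operator (LASSO) [16]. In this context, the cost function of Problem P2 is given by

$$\mathcal{J}_2(\mathbf{s}, \mathbf{e}; \mathbf{y}, \mathbf{A}) = \|\mathbf{y} - \mathbf{A}\mathbf{s} - \mathbf{e}\|_2^2 + \lambda_s\|\mathbf{s}\|_1 + \lambda_e\|\mathbf{e}\|_1, \tag{28}$$

where $\lambda_s, \lambda_e > 0$ are the regularization parameters. Here and hereafter, we write $\mathbf{A} := \mathbf{A}_{\tilde{\theta}'}$ when it is not useful to specify the dependence of matrix $\mathbf{A}$ on $\tilde{\boldsymbol{\theta}}'$. If the phasor angle measurements are accurate (i.e., $\mathbf{e} = \mathbf{0}$), (28) reduces to $\mathcal{J}_1(\mathbf{s}; \mathbf{y}, \mathbf{A}) = \|\mathbf{y} - \mathbf{A}\mathbf{s}\|_2^2 + \lambda_s\|\mathbf{s}\|_1$. This cost function is adopted in [8] for the line outage identification problem without bad data. It is known that large values of the regularization parameters result in more sparsity in $(\mathbf{s}, \mathbf{e})$. However, the best choice of $(\lambda_s, \lambda_e)$ depends strongly on the statistical properties of $(\mathbf{s}, \mathbf{e})$ (e.g., the sparsity of $(\mathbf{s}, \mathbf{e})$) and the noise variance [17], which could be difficult to determine in practice. In addition, LASSO is highly suboptimal and thus not well suited to the power line outage identification problem, which requires very high reliability. The remainder of this section is devoted to devising a fast near-optimal algorithm for estimating $(\mathbf{s}, \mathbf{e})$ from $\mathbf{y}$.

A. Theoretical Foundation

To develop our algorithm, we adopt probabilistic Bayesian inference because this approach provides a foundation for achieving the best estimates in terms of mean-squared error [18]. Most importantly, Bayesian inference can be implemented by a factor-graph framework, which leads to low-complexity message-passing solutions.

Bayesian inference begins with deriving the posterior probability according to Bayes' rule:

$$P(\mathbf{s}, \mathbf{e} \mid \mathbf{y}) = \frac{P(\mathbf{y} \mid \mathbf{s}, \mathbf{e})\, P_s(\mathbf{s})\, P_e(\mathbf{e})}{P(\mathbf{y})}, \tag{29}$$

where $P_s(\mathbf{s})$ and $P_e(\mathbf{e})$ are the prior distributions of $\mathbf{s}$ and $\mathbf{e}$, respectively, $P(\mathbf{y} \mid \mathbf{s}, \mathbf{e})$ is the likelihood, and $P(\mathbf{y})$ is the marginal likelihood. Specifically, the likelihood derived from the conditional distribution of $\mathbf{y}$ based on (21) is given by

$$P(\mathbf{y} \mid \mathbf{s}, \mathbf{e}) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\!\left( -\frac{\|\mathbf{y} - \mathbf{A}\mathbf{s} - \mathbf{e}\|_2^2}{2\sigma^2} \right). \tag{30}$$

With $P(\mathbf{s}, \mathbf{e} \mid \mathbf{y})$, the marginal posterior probabilities of $\mathbf{s}$ and $\mathbf{e}$ can be obtained by $P(\mathbf{s} \mid \mathbf{y}) = \int P(\mathbf{s}, \mathbf{e} \mid \mathbf{y})\, d\mathbf{e}$ and $P(\mathbf{e} \mid \mathbf{y}) = \sum_{\mathbf{s} \in \{0,1\}^L} P(\mathbf{s}, \mathbf{e} \mid \mathbf{y})$, respectively. Then the Bayes-optimal way to estimate $\mathbf{s}$ and $\mathbf{e}$ is given by [18]

$$\hat{s}_l = \sum_{s_l \in \{0,1\}} s_l\, Q(s_l) \quad \text{and} \quad \hat{e}_n = \int e_n\, Q(e_n)\, de_n, \tag{31}$$

where

$$Q(s_l) = \sum_{\mathbf{s}_{\setminus l}} \int P(\mathbf{s}, \mathbf{e} \mid \mathbf{y})\, d\mathbf{e} \quad \text{and} \quad Q(e_n) = \sum_{\mathbf{s}} \int P(\mathbf{s}, \mathbf{e} \mid \mathbf{y})\, d\mathbf{e}_{\setminus n} \tag{32}$$

denote the marginal posterior probabilities of $s_l$ and $e_n$, respectively. Here, the notation $\mathbf{a}_{\setminus i}$ stands for the set of all entries in $\mathbf{a}$ except for the entry indexed by $i$; for example, $\mathbf{s}_{\setminus l} = [s_1 \cdots s_{l-1}\ s_{l+1} \cdots s_L]^T$ and $\mathcal{N}_{\setminus n} = \{1, \ldots, n-1, n+1, \ldots, N\}$.

From (29), to obtain the posterior probability, the prior distributions of $\mathbf{s}$ and $\mathbf{e}$ are required. For line outages, it is reasonable to assume them to be independent and identically distributed (i.i.d.) random variables with Bernoulli distribution

$$P_s(s_l = 1; \rho_o) = \rho_o = 1 - P_s(s_l = 0; \rho_o). \tag{33}$$

Then the prior probability of $\mathbf{s}$ can be expressed as

$$P_s(\mathbf{s}; \rho_o) = \prod_{l=1}^{L} P_s(s_l; \rho_o). \tag{34}$$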

Also, from Figure 1, we see that $\mathbf{e}$ consists of sparse impulsive components, and the impulsive components have significantly different variances. These observations motivate us to model the elements of $\mathbf{e} = [e_n]$ by a Bernoulli-Gaussian-mixture (B-GM) distribution:

$$P_e(e_n; \boldsymbol{\rho}, \boldsymbol{\mu}, \boldsymbol{\sigma}) = \rho_0\, \delta(e_n) + \sum_{k=1}^{K} \rho_k\, \mathcal{N}(e_n; \mu_k, \sigma_k^2), \tag{35}$$

where $\delta(\cdot)$ denotes the Dirac delta, $\mathcal{N}(e_n; \mu_k, \sigma_k^2)$ denotes a Gaussian probability density function (pdf) with mean $\mu_k$ and variance $\sigma_k^2$, $\rho_k$ is the mixing probability of the $k$th GM component, and $\sum_{k=0}^{K} \rho_k = 1$. The value of $K$ indicates the number of different variances in $\mathbf{e}$. In the simulations of Section V, we use $K = 3$. Letting $\boldsymbol{\omega}$ collect the B-GM parameters $\{\rho_k, \mu_k, \sigma_k^2\}$, the prior probability of $\mathbf{e}$ is written as

$$P_e(\mathbf{e}; \boldsymbol{\omega}) = \prod_{n=1}^{N} P_e(e_n; \boldsymbol{\omega}). \tag{36}$$

Note that the true distribution of $\mathbf{e}$ may not be the B-GM distribution. However, our numerical results will demonstrate that the B-GM distribution is an adequate choice.

Even with the probability models of $\mathbf{s}$ and $\mathbf{e}$, there are two critical issues when implementing the optimal Bayes estimation (31). First, the marginal posterior probabilities $Q(s_l)$ and $Q(e_n)$ in (32) are not computationally tractable. Second, the prior parameters $(\rho_o, \boldsymbol{\omega})$ for $(P_s, P_e)$ are unknown. To obtain an estimate of $\{Q(s_l), Q(e_n)\}$, we use belief propagation (BP), an iterative message passing algorithm [19,20]. Meanwhile, we use the expectation-maximization (EM) algorithm in [21] to learn the prior parameters $(\rho_o, \boldsymbol{\omega})$. We describe the two algorithms and their connections next.
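To make the tractability issue concrete, the following sketch computes the marginal posteriors by brute-force enumeration for a tiny problem. It is only an illustration under simplifying assumptions: the contamination term is set to zero (no bad data), the matrix `A` and all numerical values are hypothetical, and the function name is not from the paper. The enumeration visits all $2^L$ binary vectors, which is exactly what becomes infeasible for realistic networks.

```python
import itertools
import math


def posterior_marginals(A, y, rho, sigma2):
    """Brute-force Q(s_l = 1) for y = A s + noise, Bernoulli(rho) prior, e = 0 assumed."""
    num_lines = len(A[0])
    weights = [0.0] * num_lines
    total = 0.0
    for s in itertools.product((0, 1), repeat=num_lines):
        resid = [y[n] - sum(A[n][l] * s[l] for l in range(num_lines))
                 for n in range(len(y))]
        log_lik = -sum(r * r for r in resid) / (2.0 * sigma2)
        k = sum(s)
        log_prior = k * math.log(rho) + (num_lines - k) * math.log(1.0 - rho)
        w = math.exp(log_lik + log_prior)   # unnormalized posterior of this s
        total += w
        for l in range(num_lines):
            if s[l]:
                weights[l] += w
    return [w / total for w in weights]


# Hypothetical 4 x 5 matrix; the observation equals column 3 (line 3 in outage).
A = [[0.2, 0.0, 0.25, 0.0, 0.0],
     [-0.2, 0.6, 0.0, 0.0, 0.375],
     [0.0, -0.6, -0.25, 0.375, 0.0],
     [0.0, 0.0, 0.0, -0.375, -0.375]]
y = [row[3] for row in A]
q = posterior_marginals(A, y, rho=0.1, sigma2=1e-3)
s_hat = [1 if p > 0.5 else 0 for p in q]
```

Thresholding the marginals at 0.5 is one simple way to turn the posterior means into a binary decision; it is an illustrative choice rather than a step taken from the paper.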

B. Message Passing Algorithm 

In this subsection, we develop a computationally efficient algorithm for calculating $Q(s_l)$ and $Q(e_n)$, which, in particular, is based on the approximate message passing (AMP) algorithm from [19,20,22]. For conciseness, we often omit $\rho_o$ from $P_s(\mathbf{s}; \rho_o)$ [or $P_s(s_l; \rho_o)$], and $\boldsymbol{\omega}$ from $P_e(\mathbf{e}; \boldsymbol{\omega})$ [or $P_e(e_n; \boldsymbol{\omega})$].

Fig. 3. Factor graph for the six-bus system in Figure 2.

AMP can be derived from the perspective of BP, which is a technique to factorize the posterior probability into a product of simpler probability functions. Let $\mathcal{L}(n)$ be the set of lines connected to bus $n$, and $\mathcal{N}(l)$ be the set of buses connected by line $l$. For ease of notation, $\mathbf{s}_{\mathcal{L}(n)}$ is denoted by $\mathbf{s}_{(n)}$. With these notations, the likelihood in (30) can be expressed as


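$$P(\mathbf{y} \mid \mathbf{s}, \mathbf{e}) = \prod_{n=1}^{N} P(y_n \mid \mathbf{s}_{(n)}, e_n) = \frac{1}{Z} \prod_{n=1}^{N} \exp\!\left( -\frac{1}{2\sigma^2} \Big( y_n - \sum_{l \in \mathcal{L}(n)} A_{nl}\, s_l - e_n \Big)^{2} \right), \tag{37}$$

where $Z$ denotes a normalization factor. Substituting (34), (36), and (37) into (29), we obtain a factor graph representing the factorization of (29) as

$$P(\mathbf{s}, \mathbf{e} \mid \mathbf{y}) \propto \prod_{n=1}^{N} P(y_n \mid \mathbf{s}_{(n)}, e_n) \prod_{l=1}^{L} P_s(s_l) \prod_{n=1}^{N} P_e(e_n). \tag{38}$$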

The factor graph for the six-bus system shown in Figure 2 is depicted in Figure 3, where a circle represents a variable node associated with the line outage indicator $s_l$ or the contamination $e_n$, whereas a square indicates a factor node associated with the sub-constraint function, i.e., $P(y_n \mid \mathbf{s}_{(n)}, e_n)$ for bus $n$. For each variable $s_l$, there is an edge between variable node $l$ and factor node $n$ if and only if line $l$ is connected to bus $n$.

In summary, BP can be regarded as a numerically efficient algorithm to obtain (32) based on the factorization in (38). The algorithm consists of a set of message passing equations that go from factor nodes to variable nodes and vice versa. Because of the inherent sparse structure of power networks, the computation of the marginal posterior probabilities is tractable using the BP algorithm.


Algorithm 2: SwAMP Algorithm

input : $\mathbf{A} = [A_{nl}] \in \mathbb{R}^{N \times L}$ and $\mathbf{y} = [y_n] \in \mathbb{R}^{N}$
output: $(\hat{\mathbf{s}}, \hat{\mathbf{e}})$
initialize: $t \leftarrow 1$, together with the initial estimates and variances of $\{s_l\}$ and $\{e_n\}$ [initial values illegible in the source]

2     while the stopping criterion is not met and $t < T_{\max}$ do
3-5       compute, for all $n$, the messages from the variable nodes to the factor nodes [update equations illegible in the source];
          $[i_1, i_2, \ldots, i_{L+N}] \leftarrow \mathrm{permute}(\{1, 2, \ldots, L+N\})$;
          for $j = 1$ to $L+N$ do
              $l \leftarrow i_j$;
              if $l \le L$ then
                  compute the messages from the factor nodes to variable node $s_l$, then refresh the quantities of every bus $i \in \mathcal{N}(l)$ [update equations illegible in the source];
              else
                  $n \leftarrow l - L$;
                  compute the messages from the factor nodes to variable node $e_n$ via the scalar functions $f_{e,1}$ and $f_{e,2}$ [update equations illegible in the source];
          Prior parameter learning:
27        Update the outage probability $\rho_o$ and the B-GM priors $\boldsymbol{\omega}$ using lines 1-4 in Table I;
28        Update the noise variance $\sigma^2$ using line 5 in Table I;
29        $t \leftarrow t + 1$;


However, the computational complexity is still high because the messages are continuous probability distributions. Therefore, we resort to AMP, a variant of BP, which was initially proposed by Donoho et al. (2009) [19] to solve a linear inverse problem in the context of CS. Applying the AMP technique, we have developed an AMP-based algorithm for estimating $(\mathbf{s}, \mathbf{e})$, which is summarized as Algorithm 2. Therein, lines 3-5 correspond to the messages from variable nodes $\{s_l, e_n\}$ to factor nodes, and lines 9-17 and lines 16-20 correspond to the messages from factor nodes to variable nodes $\{s_l\}$ and variable nodes $\{e_n\}$, respectively. Our version of AMP is closer to [22], referred to as the swept AMP (SwAMP), which slightly modifies the parallel update pattern of AMP to a sequential, or swept, one. We find that SwAMP is particularly useful in our case because $\mathbf{A}$ is a very sparse matrix. In fact, a sparse matrix is very advantageous in terms of computational efficiency. Due to space limitations, we refer the interested reader to [19,20,22] for more information.
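In AMP-type algorithms the variable-node update typically reduces to a scalar denoising problem: estimate a variable from a pseudo-observation equal to the variable plus Gaussian noise. The sketch below shows the posterior mean for a Bernoulli-Gaussian-mixture prior under that assumption. It is a generic textbook-style denoiser written for illustration; whether it coincides with the function used in Algorithm 2 cannot be confirmed from the text, and all names and numerical values are hypothetical.

```python
import math


def gauss(x, mean, var):
    """Gaussian pdf N(x; mean, var)."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def bgm_posterior_mean(r, tau, rho0, rhos, mus, variances):
    """E[e | r] for r = e + w, w ~ N(0, tau), with a Bernoulli-Gaussian-mixture prior on e.

    rho0 is the weight of the point mass at zero; rhos, mus, variances
    describe the Gaussian components (hypothetical parameter layout).
    """
    psi0 = rho0 * gauss(r, 0.0, tau)          # evidence for e = 0
    num = 0.0
    den = psi0
    for rho, mu, var in zip(rhos, mus, variances):
        psi = rho * gauss(r, mu, var + tau)   # evidence for component k
        comp_mean = (mu * tau + r * var) / (var + tau)  # posterior mean within component k
        num += psi * comp_mean
        den += psi
    return num / den


# A mostly-zero prior shrinks a small observation strongly toward zero.
small = bgm_posterior_mean(0.1, 0.01, 0.9, [0.1], [0.0], [1.0])
# A pure Gaussian prior gives the usual linear (Wiener-type) shrinkage.
linear = bgm_posterior_mean(2.0, 1.0, 0.0, [1.0], [0.0], [1.0])
```

The strong shrinkage of small inputs is what allows such a denoiser to promote sparsity without a hand-tuned regularization parameter.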


C. Prior Parameter Estimation 


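In the above AMP, the prior parameters $(\rho_o, \boldsymbol{\omega})$ are treated as known. We now apply the EM algorithm in [21] to learn the prior parameters $(\rho_o, \boldsymbol{\omega})$. The EM algorithm is an iterative technique that increases a lower bound on the marginal likelihood $P(\mathbf{y}; \rho_o, \boldsymbol{\omega})$ at each iteration. Briefly, given a previous parameter estimate $(\rho_o^t, \boldsymbol{\omega}^t)$, the EM update for the parameters is achieved by [21]

$$(\rho_o^{t+1}, \boldsymbol{\omega}^{t+1}) = \operatorname*{argmax}_{\rho_o,\, \boldsymbol{\omega}} \ \mathbb{E}\big\{ \log P(\mathbf{s}, \mathbf{e}, \mathbf{y}; \rho_o, \boldsymbol{\omega}) \,\big|\, \mathbf{y}; \rho_o^t, \boldsymbol{\omega}^t \big\}, \tag{39}$$

where the expectation is taken over the posterior probability of $(\mathbf{s}, \mathbf{e})$.

A procedure for solving the optimization in (39) was developed in [21]. Following steps similar to those in [21], we can obtain the EM updates of the prior parameters $(\rho_o, \boldsymbol{\omega})$ and the noise level $\sigma^2$, which are summarized in Table I. These parameter estimation procedures are incorporated in lines 27-28 of Algorithm 2.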
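As a minimal illustration of an EM update of this kind, consider only the Bernoulli prior on the line indicators. For an i.i.d. Bernoulli prior, maximizing the expected complete-data log-likelihood gives the average of the current posterior probabilities. The sketch below shows this standard result; it is not a transcription of Table I, and the function name and the example values are hypothetical.

```python
def em_update_outage_probability(posterior_probs):
    """M-step for an i.i.d. Bernoulli prior: mean of the posterior probabilities Q(s_l = 1)."""
    return sum(posterior_probs) / len(posterior_probs)


# Hypothetical posterior probabilities for four lines after one message-passing sweep.
q = [0.9, 0.1, 0.0, 0.0]
rho_new = em_update_outage_probability(q)
```

Alternating such parameter updates with the message passing sweeps is what removes the need to know the number of outages in advance.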


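TABLE I
EM updates of the prior parameters $(\rho_o, \boldsymbol{\omega})$ and the noise level $\sigma^2$

Line 1: update of the outage probability $\rho_o$.
Lines 2-4: updates of the B-GM parameters $\rho_k$, $\mu_k$, and $\sigma_k^2$ for $k = 1, \ldots, K$, expressed through the component weights $\psi_{0,n}$ and $\psi_{k,n}$.
Line 5: update of the noise level $\sigma^2$.
[The update formulas of this table are illegible in the source; see [21] for the derivation.]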






V. Simulation Results and Discussion 

In this section, we conduct computer simulations to demonstrate the effectiveness and efficiency of the proposed line outage identification algorithm. Three typical benchmark power systems are considered: IEEE 118-bus, IEEE 300-bus, and Polish 2736-bus.⁴ The software toolbox MATPOWER [23] is used to generate the phasor angle measurements corresponding to these power systems, as well as the pertinent power flows. The performance metrics of interest are the identification rate (or the hit rate) and the false alarm rate; specifically, if $\hat{\mathcal{L}}_o$ denotes the estimated set of the lines in outage,⁵ the identification rate and the false alarm rate are defined by

$$k_I = \frac{|\hat{\mathcal{L}}_o \cap \mathcal{L}_o|}{|\mathcal{L}_o|} \quad \text{and} \quad k_F = 1 - \frac{|\hat{\mathcal{L}}_o \cap \mathcal{L}_o|}{|\hat{\mathcal{L}}_o|},$$

respectively. We consider the two metrics simultaneously because the identification rate alone can be made trivially high at the cost of a very high false alarm rate. All the performance results (i.e., the rates) shown in this paper are based on 1,000 randomly selected outage locations for each number of line outages. Also, 10 independent noise-perturbed realizations are generated for each selected outage location, where the standard deviation (STD) of the noise is set equal to 0%, 1%, or 3% of the average pre-event power injection.

⁴ Similar to the other line outage identification schemes, e.g., [2-4,8,9], the proposed scheme is unable to directly distinguish the outage of a fraction of multiple lines that connect the "same" set of buses. For this reason, we slightly modify the systems (i.e., the duplicated lines that connect the same pair of buses are merged into a single line) to exclude this particular scenario.
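The two performance metrics can be computed with a short Python sketch. It assumes that the identification rate is the fraction of truly outaged lines that are detected and that the false alarm rate is the fraction of declared lines that are not actually in outage; the function names and the example sets are hypothetical.

```python
def identification_rate(true_lines, est_lines):
    """Fraction of the truly outaged lines that appear in the estimate."""
    true_lines, est_lines = set(true_lines), set(est_lines)
    return len(true_lines & est_lines) / len(true_lines)


def false_alarm_rate(true_lines, est_lines):
    """Fraction of the declared lines that are not actually in outage."""
    true_lines, est_lines = set(true_lines), set(est_lines)
    if not est_lines:
        return 0.0  # nothing declared, so nothing can be a false alarm
    return 1.0 - len(true_lines & est_lines) / len(est_lines)


# Hypothetical example: two true outages, three declared lines, one correct.
k_i = identification_rate({3, 7}, {3, 9, 11})
k_f = false_alarm_rate({3, 7}, {3, 9, 11})
```

Declaring every line as outaged would drive the identification rate to one while the false alarm rate approaches one as well, which is why the two metrics are reported together.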

A. Test Case A (without Bad Data) 

In the first experiment, we examine the capability of Algorithm 2 for line outage identification when no bad data are present. In this case, Problem P2 reduces to Problem P1 since only the estimate of $\mathbf{s}$ is required. Under this setting, we evaluate the corresponding performance under two different levels of prior knowledge of the system state.

Test Case A-1: In the first case, we assume that the number
of line outages $|\mathcal{L}_o|$ and the noise variance are available.
We refer to this prior information as the (statistical)
system-state information (SSI). Note that our proposed method
does not require the SSI because it can
be learned as part of the estimation procedure (i.e., lines 27-28
of Algorithm 2). However, the SSI is required by the
comparison schemes: 1) the LASSO scheme [8], 2) the cross-entropy
optimization (CEO) scheme [9], and 3) exhaustive
search (ES). For fair comparison, we assume perfect
prior SSI for all the schemes in this experiment. The results
of ES serve as the performance benchmark. However, because
the overall complexity of ES grows combinatorially with the
number of line outages, ES results are
available only up to $|\mathcal{L}_o| = 2$. Note that since the number
of line outages is already known, the false alarm rate is not informative.
Therefore, we only consider the identification rate $\kappa_I$ in this
case. In addition, note that the output indicator vectors
$\mathbf{s}_o$ produced by LASSO and SwAMP are real-valued. These real
values can be interpreted as outage probabilities. Since
the number of line outages is known to be $|\mathcal{L}_o|$, we select the
lines with the $|\mathcal{L}_o|$ largest probabilities as the line outages.
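
The selection step with a known number of outages can be sketched as follows. This is a minimal illustration that assumes the solver returns one real-valued score per line; the function name and the example scores are hypothetical.

```python
def select_top_k(outage_scores, k):
    """Return the indices of the k largest scores, in ascending index order."""
    scores = list(outage_scores)
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the number of lines")
    # Rank line indices by score, highest first, and keep the first k.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


scores = [0.02, 0.91, 0.10, 0.67, 0.05]
print(select_top_k(scores, 2))  # [1, 3]
```

Because exactly k lines are always reported, every missed outage is matched by one wrongly reported line, so the false alarm rate carries no extra information in this setting.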

The identification rates of the various algorithms with prior
SSI are listed in Table II (the first four columns after Noise
STD). Under the same noise level, the identification rates
achieved by Algorithm 2 are higher than those of LASSO in
every case and very
close to those of CEO and ES. Particularly for the Polish
2736-bus system, the identification rates of all the algorithms
degrade noticeably as the injection noise level increases. Note that
although the computational complexity of CEO is lower
than that of ES, it is still much higher than that of Algorithm 2.
These results indicate that Algorithm 2 is more suitable than the
others in terms of detection reliability and computational
efficiency.
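
To give a feel for why ES serves as a benchmark only for a small number of outages, the following sketch counts the candidate outage sets that an exhaustive search would have to test, assuming it enumerates every subset of lines of the given size. The line counts are the L values listed in the tables of this section; the helper name is hypothetical, and the count is not the complexity expression of the original analysis.

```python
from math import comb

# Number of lines L of each test system, as listed in the tables.
systems = {"IEEE 118-bus": 179, "IEEE 300-bus": 409, "Polish 2736-bus": 3495}


def candidate_sets(num_lines, num_outages):
    """Number of distinct outage sets of a given size among num_lines lines."""
    return comb(num_lines, num_outages)


for name, num_lines in systems.items():
    for k in (2, 3):
        count = candidate_sets(num_lines, k)
        print(f"{name}: {count:,} candidate sets for {k} outages")
```

For the 118-bus system this gives 15,931 candidate pairs but 939,929 candidate triples.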

⁵ Recall from (8) that $\mathcal{L}_o$ denotes the set of the lines in outage.

TABLE II

(Test Case A) Identification and false alarm rates of various algorithms.

N     L     |L_o|  Noise   Algorithms (Test Case A-1)             Alg. 2 (Test Case A-2)
                   STD     ES       CEO      LASSO    Alg. 2      κ_I      κ_F
----  ----  -----  ------  -------  -------  -------  -------     -------  -------
118   179   2      0%      100.0%   99.8%    99.4%    100.0%      98.4%    0.0%
118   179   2      1%      99.7%    99.6%    98.5%    99.6%       98.4%    0.0%
118   179   2      3%      98.1%    97.9%    95.8%    98.1%       96.2%    0.7%
118   179   3      0%      NA       99.8%    99.3%    100.0%      98.9%    0.0%
118   179   3      1%      NA       99.4%    98.3%    99.6%       98.7%    0.3%
118   179   3      3%      NA       98.0%    96.2%    98.2%       96.8%    0.8%
300   409   2      0%      100.0%   99.9%    99.7%    100.0%      97.6%    0.0%
300   409   2      1%      98.8%    98.7%    97.5%    98.8%       97.5%    0.0%
300   409   2      3%      96.9%    96.8%    95.3%    96.9%       95.7%    1.1%
300   409   3      0%      NA       100.0%   99.6%    99.9%       98.2%    0.0%
300   409   3      1%      NA       99.0%    97.8%    99.1%       98.1%    0.0%
300   409   3      3%      NA       97.3%    95.6%    97.2%       96.0%    1.0%
2736  3495  3      0%      NA       99.9%    99.7%    99.9%       97.8%    0.0%
2736  3495  3      1%      NA       90.9%    90.1%    90.5%       88.2%    6.3%
2736  3495  3      3%      NA       77.3%    76.0%    77.1%       74.9%    9.1%

Test Case A-2: Next, we consider the case without
prior SSI (i.e., the number of line outages and the noise
variance are unknown). Because the comparison schemes mentioned in
Test Case A-1 cannot work effectively when the SSI is unavailable,
their results are not included in the following experiments.
Since the number of line outages is unknown, we
estimate $\mathcal{L}_o$ via

$$\hat{\mathcal{L}}_o = \{ l \in \mathcal{L} : s_{o,l} > r \}, \tag{41}$$

where $0 < r < 1$ is the critical value. That is, we
make a hard decision on the real vector $\mathbf{s}_o$. Clearly,
a lower value of $r$ leads to a higher identification
rate $\kappa_I$ but also to a higher false alarm rate $\kappa_F$.
Therefore, a proper choice of $r$ is important. In
our experiments, we find that $r = 0.5$ generally yields
good results. The corresponding results are listed in the last
two columns of Table II. Comparing the identification rates $\kappa_I$
of Algorithm 2 with prior SSI (column 8) and without
prior SSI (column 9), we see that the identification rates are
only slightly degraded by the lack of prior SSI. In
addition, the false alarm rates are very low. Even
in the worst case (except for the inherently sensitive 2736-bus
system), the false alarm rate is only 1.1%. These results
illustrate that Algorithm 2 in conjunction with (41) provides a
highly effective approach for line outage identification even if
the prior SSI is unknown. All the following experiments are
conducted without prior SSI.
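
The hard decision on the real-valued indicator vector can be sketched as below. This is a minimal illustration with made-up scores; the function and variable names are hypothetical, and only the threshold range and the default of 0.5 come from the text.

```python
def hard_decision(outage_scores, threshold=0.5):
    """Return the set of line indices whose score exceeds the threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    return {line for line, score in enumerate(outage_scores) if score > threshold}


scores = [0.02, 0.91, 0.48, 0.67, 0.05]
for threshold in (0.3, 0.5, 0.8):
    print(threshold, sorted(hard_decision(scores, threshold)))
```

With these scores, a threshold of 0.3 reports lines 1, 2, and 3, a threshold of 0.5 reports lines 1 and 3, and a threshold of 0.8 reports only line 1, which shows the trade-off between the two rates as the threshold moves.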

B. Test Case B (with Bad Data) 

In the second experiment, we test the proposed algorithm
in scenarios in which bad data are present at some
PMUs. In the simulations, bad data are generated by
the continuous uniform distribution $\mathcal{U}(-\theta, \theta)$, where $\theta$ is
determined by the magnitude of the pre-event phasor
angle. As such, bad data, being of similar scale to the common
phase angles, are concealed in the true data, which makes them
difficult to detect by conventional statistical tests. We
apply Algorithm 1, which outputs the indicator vector for the
line outage $\mathbf{s}_o$ by solving $(\mathbf{s}, \mathbf{e})$ from Problem P2 followed
by the S-phase and the R-phase. Since $\mathbf{s}$ from Problem P2


TABLE III

(Test Case B) Performance of Algorithm 1 when the buses with
bad data and the buses associated with line outages are
involved.

N     L     |L_o|  |L_b|  Noise   κ_I                  κ_F
                          STD     S-phase   Final      S-phase   Final
----  ----  -----  -----  ------  --------  -------    --------  -------
118   179   3      1      0%      97.4%     97.8%      40.1%     2.5%
118   179   3      1      1%      96.5%     96.7%      41.8%     4.3%
118   179   3      1      3%      94.7%     94.7%      42.6%     6.3%
118   179   3      2      0%      92.6%     94.8%      57.6%     8.4%
118   179   3      2      1%      92.6%     93.7%      58.1%     10.1%
118   179   3      2      3%      91.4%     92.1%      59.1%     13.3%
300   409   3      1      0%      97.5%     97.7%      41.3%     2.7%
300   409   3      1      1%      96.5%     96.1%      40.6%     4.8%
300   409   3      1      3%      94.2%     93.5%      41.9%     7.7%
300   409   3      2      0%      94.4%     95.0%      55.8%     7.9%
300   409   3      2      1%      93.9%     93.1%      56.9%     10.4%
300   409   3      2      3%      91.8%     90.5%      58.1%     14.8%
2736  3495  3      2      0%      97.7%     98.3%      57.0%     3.2%
2736  3495  3      2      1%      92.6%     81.7%      61.0%     26.3%
2736  3495  3      2      3%      86.8%     70.1%      63.7%     40.1%

TABLE IV

(Test Case B) Performance of Algorithm 1 when the buses with
bad data and the buses associated with line outages are
completely separated.

N     L     |L_o|  |L_b|  Noise   κ_I                  κ_F
                          STD     S-phase   Final      S-phase   Final
----  ----  -----  -----  ------  --------  -------    --------  -------
118   179   3      1      0%      99.1%     99.0%      42.3%     5.2%
118   179   3      1      1%      98.0%     97.8%      43.3%     7.1%
118   179   3      1      3%      96.0%     95.8%      43.7%     9.3%
118   179   3      2      0%      97.9%     97.9%      58.6%     14.0%
118   179   3      2      1%      97.3%     97.1%      58.6%     14.9%
118   179   3      2      3%      95.3%     95.1%      59.7%     17.8%
300   409   3      1      0%      99.0%     98.9%      35.4%     6.6%
300   409   3      1      1%      98.1%     97.9%      36.0%     7.5%
300   409   3      1      3%      95.7%     95.5%      37.2%     9.6%
300   409   3      2      0%      99.2%     99.1%      50.8%     11.4%
300   409   3      2      1%      97.9%     97.7%      51.9%     13.7%
300   409   3      2      3%      95.6%     95.4%      52.8%     16.8%
2736  3495  3      2      0%      94.7%     94.7%      59.8%     6.8%
2736  3495  3      2      1%      88.9%     88.9%      63.8%     31.9%
2736  3495  3      2      3%      76.3%     76.2%      67.5%     44.7%
TABLE V

Average running time of Algorithm 1 (in seconds)

N     L     |L_o|  Noise   Test Case A      Test Case B
                   STD     (|L_b| = 0)      (|L_b| = 2)
----  ----  -----  ------  ---------------  ---------------
118   179   3      3%      6.70 × 10^-?     2.35 × 10^-?
300   409   3      3%      1.61 × 10^-?     4.10 × 10^-?
2736  3495  3      3%      6.24 × 10^-?     7.82 × 10^-?


is a real vector, we convert it to a binary vector by using the
same hard decision technique as in (41).
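
The bad-data model used in Test Case B can be sketched as follows. The exact rule for the bound of the uniform distribution is not reproduced here, so the sketch assumes, purely for illustration, that the bound equals the largest pre-event angle magnitude, and it further assumes that the bad data enter as an additive error on the selected buses. The function and argument names are hypothetical.

```python
import random


def inject_bad_data(angles, bad_buses, rng=None):
    """Return a copy of the angle measurements with bad data on bad_buses."""
    rng = rng or random.Random()
    # Assumption: the bound is the largest pre-event angle magnitude, so the
    # errors have the same scale as ordinary phase angles.
    bound = max(abs(angle) for angle in angles)
    corrupted = list(angles)
    for bus in bad_buses:
        # Assumption: the bad data are an additive error drawn from U(-bound, bound).
        corrupted[bus] += rng.uniform(-bound, bound)
    return corrupted


angles = [0.10, -0.25, 0.05, 0.30]  # illustrative pre-event angles in radians
print(inject_bad_data(angles, bad_buses=[1, 3], rng=random.Random(0)))
```

Because the injected errors are no larger than the angles themselves, a simple magnitude check on the measurements would not single them out.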

To evaluate the proposed algorithm, we consider two
kinds of bad data locations: the buses with bad data
and the buses associated with line outages are i) involved or ii)
completely separated. The former case is practical (and more
challenging) because it is very likely that the buses associated
with line outages have faulty PMUs. Table III and Table IV
list the corresponding results for the two cases with $|\mathcal{L}_o| = 3$
and various numbers of bad data, $|\mathcal{L}_b| \in \{1, 2\}$. The results
for $\kappa_I$ and $\kappa_F$ each contain two columns. The values in the first
column are the results after the S-phase (the first
phase) of Algorithm 1, and those in the second column are the final
results of Algorithm 1 (i.e., after the R-phase). Recall that
the R-phase is mainly used to eliminate the state uncertainty
between the bad data and line outages. We see that the false
alarm rate $\kappa_F$ is greatly reduced by the R-phase, while the
final identification rate $\kappa_I$ remains quite reliable. These results
illustrate the effectiveness of Algorithm 1 in the presence of bad data.
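
As a small worked check of the effect of the R-phase, the sketch below takes the 3% noise rows of Table III and computes how much each rate changes between the S-phase and the final result, in percentage points. The tuple layout and the helper name are illustrative choices.

```python
# (system, number of bad data, kappa_I after S-phase, kappa_I final,
#  kappa_F after S-phase, kappa_F final), in percent, 3% noise STD, Table III.
rows = [
    ("118-bus", 1, 94.7, 94.7, 42.6, 6.3),
    ("118-bus", 2, 91.4, 92.1, 59.1, 13.3),
    ("300-bus", 1, 94.2, 93.5, 41.9, 7.7),
    ("300-bus", 2, 91.8, 90.5, 58.1, 14.8),
    ("2736-bus", 2, 86.8, 70.1, 63.7, 40.1),
]


def r_phase_change(row):
    """Return the change in (kappa_I, kappa_F) from S-phase to final result."""
    _, _, ki_s, ki_final, kf_s, kf_final = row
    return round(ki_final - ki_s, 1), round(kf_final - kf_s, 1)


for row in rows:
    d_ki, d_kf = r_phase_change(row)
    print(f"{row[0]}, {row[1]} bad data: kappa_I {d_ki:+.1f}, kappa_F {d_kf:+.1f}")
```

On these rows the false alarm rate falls by between 23.6 and 45.8 percentage points, while the identification rate changes by less than 1.5 points except for the 2736-bus system, where it drops by 16.7 points.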

C. Running Time 

Finally, we discuss the complexity of the proposed line
outage identification algorithm. Given that line 2 of Algorithm
1 (i.e., Algorithm 2) dominates the computational cost, the
complexity of the proposed line outage identification method
can be approximated by the total number
of multiplications required by Algorithm 2, which requires
a total of \J^'{n) \ + 8N + 24L + 13KN multiplications 
for each iteration.⁶ To better convey the complexity of the
entire procedure, we summarize the average running times of
Algorithm 1 in Table V for Test Cases A and B. Each
running time is obtained by averaging over 10,000 random
samples on a 64-bit Windows 7 PC equipped with a 3.3-GHz
Intel Core E3-1230 CPU and 16 GB of memory. In our
simulator, Algorithm 1 is implemented in MATLAB
2013b, wherein line 2 (i.e., Algorithm 2) is written in the
C programming language with $\epsilon = 10^{-6}$ and $T_{\max} = 200$.
Table V shows that the average running time increases with
the system size and increases slightly when bad data are present.
Our algorithm is highly efficient; the whole
identification procedure can be completed within 1 second
even for the large 2736-bus system.

⁶ Recall that K, defined in (35), indicates the number of different variances
in $\mathbf{e}$, and we use K = S.

VI. Conclusion

In this paper, we developed a framework for identifying
multiple power line outages based on PMU measurements
in the presence of bad data. Conventionally, the locations
of line outages and bad data are indistinguishable. Exploiting
the properties of the power network topology, we presented
an algorithm that identifies the locations of line outages and
recovers the faulty measurements simultaneously. The algorithm
does not require any prior information on the number of line
outages or the noise variance. Simulations using benchmark
power systems validated the effectiveness and efficiency of
the proposed scheme. In particular, we showed that the whole
identification procedure can be completed within seconds even
for a large-scale power system, which makes our scheme
suitable for real-time applications.